Overview

Dataset statistics

Number of variables             25
Number of observations          665249
Missing cells                   279371
Missing cells (%)               1.7%
Duplicate rows                  0
Duplicate rows (%)              0.0%
Total size in memory            126.9 MiB
Average record size in memory   200.0 B
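
The derived figures in this table can be cross-checked from the raw counts. The sketch below is only a consistency check on the reported numbers, assuming that "MiB" means 2**20 bytes and that the missing-cell percentage is taken over all cells (variables times observations); the variable names are illustrative.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 25
n_observations = 665_249
missing_cells = 279_371
total_memory_mib = 126.9

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 2**20 bytes.
avg_record_bytes = total_memory_mib * 2**20 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")           # 1.7%
print(f"Average record size: {avg_record_bytes:.1f} B")   # 200.0 B
```

An average of 200 B per record is what 25 columns of 8 bytes each would occupy.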

Variable types

CAT    11
NUM    9
BOOL   5

Warnings

time has a high cardinality: 1204 distinct values     High cardinality
age_youngest is highly correlated with age_oldest     High correlation
age_oldest is highly correlated with age_youngest     High correlation
risk_factor has 240418 (36.1%) missing values         Missing
C_previous has 18711 (2.8%) missing values            Missing
duration_previous has 18711 (2.8%) missing values     Missing
day has 140539 (21.1%) zeros                          Zeros
duration_previous has 24926 (3.7%) zeros              Zeros

Reproduction

Analysis started         2022-03-22 17:04:46.899273
Analysis finished        2022-03-22 17:07:21.102110
Duration                 2 minutes and 34.2 seconds
Software version         pandas-profiling v2.9.0
Download configuration   config.yaml

Variables

customer_ID
Real number (ℝ≥0)

Distinct       97009
Distinct (%)   14.6%
Missing        0
Missing (%)    0.0%
Infinite       0
Infinite (%)   0.0%
Mean           10076553.44
Minimum        10000000
Maximum        10152724
Zeros          0
Zeros (%)      0.0%
Memory size    5.1 MiB

Quantile statistics

Minimum                     10000000
5-th percentile             10007770
Q1                          10038523
median                      10076403
Q3                          10114696
95-th percentile            10145255.6
Maximum                     10152724
Range                       152724
Interquartile range (IQR)   76173

Descriptive statistics

Standard deviation                44049.77859
Coefficient of variation (CV)     0.004371512428
Kurtosis                          -1.195788963
Mean                              10076553.44
Median Absolute Deviation (MAD)   38079
Skewness                          -0.002009957296
Sum                               6.7034171e+12
Variance                          1940382994
Monotonicity                      Increasing
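
Several of the customer_ID statistics above follow from one another by simple identities. The sketch below recomputes them from the reported figures as a sanity check; it works from the rounded values printed in the report rather than the raw data, so the comparisons hold only up to rounding, and the variable names are illustrative.

```python
# Figures copied from the customer_ID statistics.
n = 665_249
mean = 10_076_553.44
std = 44_049.77859
minimum, maximum = 10_000_000, 10_152_724
q1, q3 = 10_038_523, 10_114_696

value_range = maximum - minimum   # reported Range: 152724
iqr = q3 - q1                     # reported IQR: 76173
cv = std / mean                   # reported CV: 0.004371512428
variance = std ** 2               # reported Variance: 1940382994
total = mean * n                  # reported Sum: 6.7034171e+12

print(value_range, iqr)
print(f"{cv:.9f}", f"{variance:.0f}", f"{total:.7e}")
```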
Histogram with fixed size bins (bins=50)

Common values

Value                  Count    Frequency (%)
10129147               13       < 0.1%
10028761               13       < 0.1%
10007398               13       < 0.1%
10107396               13       < 0.1%
10103851               13       < 0.1%
10110688               13       < 0.1%
10089345               13       < 0.1%
10150752               13       < 0.1%
10063394               13       < 0.1%
10075547               13       < 0.1%
Other values (96999)   665119   > 99.9%

Minimum 5 values

Value      Count    Frequency (%)
10000000   9        < 0.1%
10000005   6        < 0.1%
10000007   8        < 0.1%
10000013   4        < 0.1%
10000014   6        < 0.1%

Maximum 5 values

Value      Count    Frequency (%)
10152724   6        < 0.1%
10152723   3        < 0.1%
10152721   6        < 0.1%
10152720   8        < 0.1%
10152718   9        < 0.1%

shopping_pt
Real number (ℝ≥0)

Distinct       13
Distinct (%)   < 0.1%
Missing        0
Missing (%)    0.0%
Infinite       0
Infinite (%)   0.0%
Mean           4.219965757
Minimum        1
Maximum        13
Zeros          0
Zeros (%)      0.0%
Memory size    5.1 MiB

Quantile statistics

Minimum                     1
5-th percentile             1
Q1                          2
median                      4
Q3                          6
95-th percentile            8
Maximum                     13
Range                       12
Interquartile range (IQR)   4

Descriptive statistics

Standard deviation                2.394368765
Coefficient of variation (CV)     0.5673905674
Kurtosis                          -0.5640492606
Mean                              4.219965757
Median Absolute Deviation (MAD)   2
Skewness                          0.4697521391
Sum                               2807328
Variance                          5.733001784
Monotonicity                      Not monotonic
Histogram with fixed size bins (bins=13)

Common values

Value                  Count    Frequency (%)
1                      97009    14.6%
2                      97009    14.6%
3                      97009    14.6%
4                      91441    13.7%
5                      83440    12.5%
6                      72171    10.8%
7                      56548    8.5%
8                      37958    5.7%
9                      20710    3.1%
10                     8725     1.3%
Other values (3)       3229     0.5%

Minimum 5 values

Value      Count    Frequency (%)
1          97009    14.6%
2          97009    14.6%
3          97009    14.6%
4          91441    13.7%
5          83440    12.5%

Maximum 5 values

Value      Count    Frequency (%)
13         50       < 0.1%
12         525      0.1%
11         2654     0.4%
10         8725     1.3%
9          20710    3.1%
 
record_type
Boolean

Distinct       2
Distinct (%)   < 0.1%
Missing        0
Missing (%)    0.0%
Memory size    5.1 MiB

Value      Count    Frequency (%)
0          568240   85.4%
1          97009    14.6%

day
Real number (ℝ≥0)

ZEROS

Distinct       7
Distinct (%)   < 0.1%
Missing        0
Missing (%)    0.0%
Infinite       0
Infinite (%)   0.0%
Mean           1.969429492
Minimum        0
Maximum        6
Zeros          140539
Zeros (%)      21.1%
Memory size    5.1 MiB

Quantile statistics

Minimum                     0
5-th percentile             0
Q1                          1
median                      2
Q3                          3
95-th percentile            4
Maximum                     6
Range                       6
Interquartile range (IQR)   2

Descriptive statistics

Standard deviation                1.4534705
Coefficient of variation (CV)     0.7380160122
Kurtosis                          -1.168196957
Mean                              1.969429492
Median Absolute Deviation (MAD)   1
Skewness                          0.1306439815
Sum                               1310161
Variance                          2.112576494
Monotonicity                      Not monotonic
Histogram with fixed size bins (bins=7)

Common values

Value      Count    Frequency (%)
0          140539   21.1%
1          136921   20.6%
2          133453   20.1%
4          123639   18.6%
3          121342   18.2%
5          8378     1.3%
6          977      0.1%

Minimum 5 values

Value      Count    Frequency (%)
0          140539   21.1%
1          136921   20.6%
2          133453   20.1%
3          121342   18.2%
4          123639   18.6%

Maximum 5 values

Value      Count    Frequency (%)
6          977      0.1%
5          8378     1.3%
4          123639   18.6%
3          121342   18.2%
2          133453   20.1%

time
Categorical

HIGH CARDINALITY

Distinct       1204
Distinct (%)   0.2%
Missing        0
Missing (%)    0.0%
Memory size    5.1 MiB

Common values

Value                  Count    Frequency (%)
14:35                  1553     0.2%
14:59                  1489     0.2%
14:57                  1471     0.2%
15:22                  1454     0.2%
15:20                  1453     0.2%
15:09                  1452     0.2%
14:40                  1441     0.2%
14:16                  1440     0.2%
14:39                  1435     0.2%
15:13                  1432     0.2%
Other values (1194)    650629   97.8%

Frequencies of value counts

Unique

Unique         99
Unique (%)     < 0.1%

Histogram of lengths of the category

Length

Max length      5
Median length   5
Mean length     5
Min length      5

state
Categorical

Distinct       36
Distinct (%)   < 0.1%
Missing        0
Missing (%)    0.0%
Memory size    5.1 MiB

Common values

Value                  Count    Frequency (%)
FL                     106287   16.0%
NY                     91627    13.8%
PA                     60677    9.1%
OH                     44537    6.7%
MD                     28443    4.3%
IN                     25295    3.8%
WA                     25188    3.8%
CO                     24409    3.7%
AL                     23560    3.5%
CT                     19353    2.9%
Other values (26)      215873   32.4%

Frequencies of value counts

Unique

Unique         0
Unique (%)     0.0%

Histogram of lengths of the category

Length

Max length      2
Median length   2
Mean length     2
Min length      2

location
Real number (ℝ≥0)

Distinct       6248
Distinct (%)   0.9%
Missing        0
Missing (%)    0.0%
Infinite       0
Infinite (%)   0.0%
Mean           12271.54302
Minimum        10001
Maximum        16580
Zeros          0
Zeros (%)      0.0%
Memory size    5.1 MiB

Quantile statistics

Minimum                     10001
5-th percentile             10174
Q1                          10936
median                      12027
Q3                          13426
95-th percentile            15119
Maximum                     16580
Range                       6579
Interquartile range (IQR)   2490

Descriptive statistics

Standard deviation                1564.789415
Coefficient of variation (CV)     0.1275136641
Kurtosis                          -0.7368574075
Mean                              12271.54302
Median Absolute Deviation (MAD)   1215
Skewness                          0.4780995787
Sum                               8163631724
Variance                          2448565.912
Monotonicity                      Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                  Count    Frequency (%)
10083                  1094     0.2%
10213                  977      0.1%
10348                  974      0.1%
11196                  869      0.1%
11517                  864      0.1%
10030                  798      0.1%
12091                  731      0.1%
10723                  703      0.1%
10793                  701      0.1%
10245                  682      0.1%
Other values (6238)    656856   98.7%

Minimum 5 values

Value      Count    Frequency (%)
10001      238      < 0.1%
10002      92       < 0.1%
10003      277      < 0.1%
10004      255      < 0.1%
10005      296      < 0.1%

Maximum 5 values

Value      Count    Frequency (%)
16580      8        < 0.1%
16579      10       < 0.1%
16578      8        < 0.1%
16577      5        < 0.1%
16576      8        < 0.1%

group_size
Categorical

Distinct       4
Distinct (%)   < 0.1%
Missing        0
Missing (%)    0.0%
Memory size    5.1 MiB

Common values

Value      Count    Frequency (%)
1          519305   78.1%
2          136393   20.5%
3          8856     1.3%
4          695      0.1%

Frequencies of value counts

Unique

Unique         0
Unique (%)     0.0%

Histogram of lengths of the category

Length

Max length      1
Median length   1
Mean length     1
Min length      1

homeowner
Boolean

Distinct       2
Distinct (%)   < 0.1%
Missing        0
Missing (%)    0.0%
Memory size    5.1 MiB

Value      Count    Frequency (%)
1          356726   53.6%
0          308523   46.4%

car_age
Real number (ℝ≥0)

Distinct       67
Distinct (%)   < 0.1%
Missing        0
Missing (%)    0.0%
Infinite       0
Infinite (%)   0.0%
Mean           8.139436512
Minimum        0
Maximum        85
Zeros          5805
Zeros (%)      0.9%
Memory size    5.1 MiB

Quantile statistics

Minimum                     0
5-th percentile             1
Q1                          3
median                      7
Q3                          12
95-th percentile            18
Maximum                     85
Range                       85
Interquartile range (IQR)   9

Descriptive statistics

Standard deviation                5.764598035
Coefficient of variation (CV)     0.7082306037
Kurtosis                          4.106639801
Mean                              8.139436512
Median Absolute Deviation (MAD)   4
Skewness                          1.223618387
Sum                               5414752
Variance                          33.2305905
Monotonicity                      Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                  Count    Frequency (%)
1                      71650    10.8%
2                      50389    7.6%
7                      46752    7.0%
8                      44453    6.7%
6                      44363    6.7%
9                      44218    6.6%
4                      40991    6.2%
3                      40391    6.1%
10                     39301    5.9%
11                     36173    5.4%
Other values (57)      206568   31.1%

Minimum 5 values

Value      Count    Frequency (%)
0          5805     0.9%
1          71650    10.8%
2          50389    7.6%
3          40391    6.1%
4          40991    6.2%

Maximum 5 values

Value      Count    Frequency (%)
85         4        < 0.1%
75         6        < 0.1%
74         18       < 0.1%
65         5        < 0.1%
64         6        < 0.1%

car_value
Categorical

Distinct       9
Distinct (%)   < 0.1%
Missing        1531
Missing (%)    0.2%
Memory size    5.1 MiB

Common values

Value       Count    Frequency (%)
e           219251   33.0%
f           177204   26.6%
d           113174   17.0%
g           98152    14.8%
h           28976    4.4%
c           20820    3.1%
i           3603     0.5%
b           1402     0.2%
a           1136     0.2%
(Missing)   1531     0.2%

Frequencies of value counts

Unique

Unique         0
Unique (%)     0.0%

Histogram of lengths of the category

Length

Max length      3
Median length   1
Mean length     1.004602788
Min length      1

risk_factor
Categorical

MISSING

Distinct       4
Distinct (%)   < 0.1%
Missing        240418
Missing (%)    36.1%
Memory size    5.1 MiB

Common values

Value       Count    Frequency (%)
3           117571   17.7%
4           110754   16.6%
1           99476    15.0%
2           97030    14.6%
(Missing)   240418   36.1%

Frequencies of value counts

Unique

Unique         0
Unique (%)     0.0%

Histogram of lengths of the category

Length

Max length      3
Median length   3
Mean length     3
Min length      3

age_oldest
Real number (ℝ≥0)

HIGH CORRELATION

Distinct       58
Distinct (%)   < 0.1%
Missing        0
Missing (%)    0.0%
Infinite       0
Infinite (%)   0.0%
Mean           44.99240284
Minimum        18
Maximum        75
Zeros          0
Zeros (%)      0.0%
Memory size    5.1 MiB

Quantile statistics

Minimum                     18
5-th percentile             22
Q1                          28
median                      44
Q3                          60
95-th percentile            75
Maximum                     75
Range                       57
Interquartile range (IQR)   32

Descriptive statistics

Standard deviation                17.40343996
Coefficient of variation (CV)     0.3868084134
Kurtosis                          -1.228547547
Mean                              44.99240284
Median Absolute Deviation (MAD)   16
Skewness                          0.2257477638
Sum                               29931151
Variance                          302.8797225
Monotonicity                      Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                  Count    Frequency (%)
75                     45211    6.8%
24                     21627    3.3%
23                     21489    3.2%
25                     21278    3.2%
22                     19692    3.0%
26                     18679    2.8%
27                     16046    2.4%
28                     15180    2.3%
21                     14255    2.1%
29                     13179    2.0%
Other values (48)      458613   68.9%

Minimum 5 values

Value      Count    Frequency (%)
18         1666     0.3%
19         6592     1.0%
20         10180    1.5%
21         14255    2.1%
22         19692    3.0%

Maximum 5 values

Value      Count    Frequency (%)
75         45211    6.8%
74         5646     0.8%
73         5861     0.9%
72         6194     0.9%
71         6825     1.0%

age_youngest
Real number (ℝ≥0)

HIGH CORRELATION

Distinct       60
Distinct (%)   < 0.1%
Missing        0
Missing (%)    0.0%
Infinite       0
Infinite (%)   0.0%
Mean           42.57758824
Minimum        16
Maximum        75
Zeros          0
Zeros (%)      0.0%
Memory size    5.1 MiB

Quantile statistics

Minimum                     16
5-th percentile             20
Q1                          26
median                      40
Q3                          57
95-th percentile            75
Maximum                     75
Range                       59
Interquartile range (IQR)   31

Descriptive statistics

Standard deviation                17.46043237
Coefficient of variation (CV)     0.4100850492
Kurtosis                          -1.146121776
Mean                              42.57758824
Median Absolute Deviation (MAD)   15
Skewness                          0.3624556029
Sum                               28324698
Variance                          304.8666986
Monotonicity                      Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                  Count    Frequency (%)
75                     34262    5.2%
23                     24208    3.6%
24                     23635    3.6%
25                     23509    3.5%
22                     22290    3.4%
26                     20510    3.1%
27                     17718    2.7%
21                     17311    2.6%
28                     16832    2.5%
29                     14437    2.2%
Other values (50)      450537   67.7%

Minimum 5 values

Value      Count    Frequency (%)
16         4478     0.7%
17         2967     0.4%
18         5093     0.8%
19         10182    1.5%
20         13480    2.0%

Maximum 5 values

Value      Count    Frequency (%)
75         34262    5.2%
74         5048     0.8%
73         5152     0.8%
72         5234     0.8%
71         6102     0.9%
 
married_couple
Boolean

Distinct       2
Distinct (%)   < 0.1%
Missing        0
Missing (%)    0.0%
Memory size    5.1 MiB

Value      Count    Frequency (%)
0          525692   79.0%
1          139557   21.0%

C_previous
Categorical

MISSING

Distinct       4
Distinct (%)   < 0.1%
Missing        18711
Missing (%)    2.8%
Memory size    5.1 MiB

Common values

Value       Count    Frequency (%)
3           271160   40.8%
1           172007   25.9%
2           109184   16.4%
4           94187    14.2%
(Missing)   18711    2.8%

Frequencies of value counts

Unique

Unique         0
Unique (%)     0.0%

Histogram of lengths of the category

Length

Max length      3
Median length   3
Mean length     3
Min length      3

duration_previous
Real number (ℝ≥0)

MISSING
ZEROS

Distinct       16
Distinct (%)   < 0.1%
Missing        18711
Missing (%)    2.8%
Infinite       0
Infinite (%)   0.0%
Mean           6.003773947
Minimum        0
Maximum        15
Zeros          24926
Zeros (%)      3.7%
Memory size    5.1 MiB

Quantile statistics

Minimum                     0
5-th percentile             1
Q1                          2
median                      5
Q3                          9
95-th percentile            15
Maximum                     15
Range                       15
Interquartile range (IQR)   7

Descriptive statistics

Standard deviation                4.680792915
Coefficient of variation (CV)     0.7796417648
Kurtosis                          -0.6492855709
Mean                              6.003773947
Median Absolute Deviation (MAD)   3
Skewness                          0.7522515321
Sum                               3881668
Variance                          21.90982232
Monotonicity                      Not monotonic
Histogram with fixed size bins (bins=16)

Common values

Value                  Count    Frequency (%)
1                      81570    12.3%
15                     79849    12.0%
2                      79595    12.0%
3                      70800    10.6%
4                      57485    8.6%
5                      49372    7.4%
6                      45379    6.8%
7                      37768    5.7%
8                      30752    4.6%
9                      26244    3.9%
Other values (6)       87724    13.2%

Minimum 5 values

Value      Count    Frequency (%)
0          24926    3.7%
1          81570    12.3%
2          79595    12.0%
3          70800    10.6%
4          57485    8.6%

Maximum 5 values

Value      Count    Frequency (%)
15         79849    12.0%
14         9739     1.5%
13         10963    1.6%
12         11284    1.7%
11         12718    1.9%

A
Categorical

Distinct       3
Distinct (%)   < 0.1%
Missing        0
Missing (%)    0.0%
Memory size    5.1 MiB

Common values

Value      Count    Frequency (%)
1          426067   64.0%
0          143691   21.6%
2          95491    14.4%

Frequencies of value counts

Unique

Unique         0
Unique (%)     0.0%

Histogram of lengths of the category

Length

Max length      1
Median length   1
Mean length     1
Min length      1

B
Boolean

Distinct       2
Distinct (%)   < 0.1%
Missing        0
Missing (%)    0.0%
Memory size    5.1 MiB

Value      Count    Frequency (%)
0          363069   54.6%
1          302180   45.4%

C
Categorical

Distinct       4
Distinct (%)   < 0.1%
Missing        0
Missing (%)    0.0%
Memory size    5.1 MiB

Common values

Value      Count    Frequency (%)
3          271607   40.8%
1          202945   30.5%
2          133468   20.1%
4          57229    8.6%

Frequencies of value counts

Unique

Unique         0
Unique (%)     0.0%

Histogram of lengths of the category

Length

Max length      1
Median length   1
Mean length     1
Min length      1

D
Categorical

Distinct       3
Distinct (%)   < 0.1%
Missing        0
Missing (%)    0.0%
Memory size    5.1 MiB

Common values

Value      Count    Frequency (%)
3          408839   61.5%
2          149793   22.5%
1          106617   16.0%

Frequencies of value counts

Unique

Unique         0
Unique (%)     0.0%

Histogram of lengths of the category

Length

Max length      1
Median length   1
Mean length     1
Min length      1

E
Boolean

Distinct       2
Distinct (%)   < 0.1%
Missing        0
Missing (%)    0.0%
Memory size    5.1 MiB

Value      Count    Frequency (%)
0          369085   55.5%
1          296164   44.5%

F
Categorical

Distinct       4
Distinct (%)   < 0.1%
Missing        0
Missing (%)    0.0%
Memory size    5.1 MiB

Common values

Value      Count    Frequency (%)
2          255806   38.5%
0          216395   32.5%
1          158613   23.8%
3          34435    5.2%

Frequencies of value counts

Unique

Unique         0
Unique (%)     0.0%

Histogram of lengths of the category

Length

Max length      1
Median length   1
Mean length     1
Min length      1

G
Categorical

Distinct       4
Distinct (%)   < 0.1%
Missing        0
Missing (%)    0.0%
Memory size    5.1 MiB

Common values

Value      Count    Frequency (%)
2          265237   39.9%
3          191163   28.7%
1          141946   21.3%
4          66903    10.1%

Frequencies of value counts

Unique

Unique         0
Unique (%)     0.0%

Histogram of lengths of the category

Length

Max length      1
Median length   1
Mean length     1
Min length      1

cost
Real number (ℝ≥0)

Distinct       531
Distinct (%)   0.1%
Missing        0
Missing (%)    0.0%
Infinite       0
Infinite (%)   0.0%
Mean           635.7850083
Minimum        260
Maximum        922
Zeros          0
Zeros (%)      0.0%
Memory size    5.1 MiB

Quantile statistics

Minimum                     260
5-th percentile             564
Q1                          605
median                      635
Q3                          665
95-th percentile            712
Maximum                     922
Range                       662
Interquartile range (IQR)   60

Descriptive statistics

Standard deviation                45.99375791
Coefficient of variation (CV)     0.0723416836
Kurtosis                          0.9558013252
Mean                              635.7850083
Median Absolute Deviation (MAD)   30
Skewness                          0.09154557472
Sum                               422955341
Variance                          2115.425766
Monotonicity                      Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                  Count    Frequency (%)
633                    6024     0.9%
637                    5985     0.9%
640                    5935     0.9%
626                    5934     0.9%
638                    5916     0.9%
639                    5912     0.9%
629                    5896     0.9%
642                    5894     0.9%
635                    5875     0.9%
623                    5854     0.9%
Other values (521)     606024   91.1%

Minimum 5 values

Value      Count    Frequency (%)
260        1        < 0.1%
263        4        < 0.1%
264        1        < 0.1%
272        2        < 0.1%
274        1        < 0.1%

Maximum 5 values

Value      Count    Frequency (%)
922        1        < 0.1%
917        1        < 0.1%
912        3        < 0.1%
911        1        < 0.1%
900        4        < 0.1%

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line does not affect r; only the sign of its slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
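
As an illustration of this definition, the following is a minimal pure-Python sketch and not the routine the report itself runs. The function name `pearson_r` and the sample data are made up for the example.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of products are sufficient.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: shifting and rescaling either variable leaves r unchanged.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
r_original = pearson_r(x, y)
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
print(r_original, r_rescaled)
```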

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
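
The rank-based calculation can be sketched in pure Python as follows. This is an illustrative sketch, not the report's own code; it assumes tied values receive the average of their rank positions, and the helper names are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation computed on the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_rx, mean_ry = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_rx) * (b - mean_ry) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_rx) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_ry) ** 2 for b in ry))
    return cov / (sd_x * sd_y)

# Hypothetical data: a nonlinear but strictly increasing relationship.
x = [1, 2, 3, 4, 5]
y = [a ** 3 for a in x]
rho = spearman_rho(x, y)
print(rho)
```

Because only the ordering matters, the cubic relationship in the example is treated as a perfect monotonic association.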

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
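
The pair-counting description corresponds to the simplest form of the coefficient (often called tau-a). The sketch below implements exactly that formula in pure Python; it is quadratic in the number of observations and is meant only as an illustration. Library implementations commonly apply a tie correction (tau-b), so results can differ when ties are present. The function name and data are hypothetical.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1   # both variables move in the same direction
        elif product < 0:
            discordant += 1   # the variables move in opposite directions
        # pairs tied in x or y count as neither
    n_pairs = n * (n - 1) // 2
    return (concordant - discordant) / n_pairs

# Hypothetical data: one swapped pair out of six.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```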

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
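
A minimal pure-Python sketch of a bias-corrected Cramér's V is shown below. It follows the correction as it is commonly stated (shrinking the phi-squared term and the effective numbers of rows and columns by terms of order 1/(n-1)); it is an illustration under that assumption and not necessarily the exact routine used to produce this report. The function name and data are hypothetical.

```python
import math
from collections import Counter

def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    joint = Counter(zip(x, y))
    row_totals = Counter(x)
    col_totals = Counter(y)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction (assumed form): shrink phi2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0  # a constant variable carries no association
    return math.sqrt(phi2_corr / denom)

# Hypothetical data: independent labels give 0, identical partitions give 1.
v_independent = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["u", "v", "u", "v"] * 25)
v_perfect = cramers_v_corrected(["a", "b"] * 50, ["u", "v"] * 50)
print(v_independent, v_perfect)
```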

Missing values


Sample

First rows

   customer_ID  shopping_pt  record_type  day  time   state  location  group_size  homeowner  car_age  car_value  risk_factor  age_oldest  age_youngest  married_couple  C_previous  duration_previous  A  B  C  D  E  F  G  cost
0  10000000     1            0            0    08:35  IN     10001     2           0          2        g          3.0          46          42            1               1.0         2.0                1  0  2  2  1  2  2  633
1  10000000     2            0            0    08:38  IN     10001     2           0          2        g          3.0          46          42            1               1.0         2.0                1  0  2  2  1  2  1  630
2  10000000     3            0            0    08:38  IN     10001     2           0          2        g          3.0          46          42            1               1.0         2.0                1  0  2  2  1  2  1  630
3  10000000     4            0            0    08:39  IN     10001     2           0          2        g          3.0          46          42            1               1.0         2.0                1  0  2  2  1  2  1  630
4  10000000     5            0            0    11:55  IN     10001     2           0          2        g          3.0          46          42            1               1.0         2.0                1  0  2  2  1  2  1  630
5  10000000     6            0            0    11:57  IN     10001     2           0          2        g          3.0          46          42            1               1.0         2.0                1  0  2  2  1  2  1  638
6  10000000     7            0            0    11:58  IN     10001     2           0          2        g          3.0          46          42            1               1.0         2.0                1  0  2  2  1  2  1  638
7  10000000     8            0            0    12:03  IN     10001     2           0          2        g          3.0          46          42            1               1.0         2.0                1  0  2  2  1  2  1  638
8  10000000     9            1            0    12:07  IN     10001     2           0          2        g          3.0          46          42            1               1.0         2.0                1  0  2  2  1  2  1  634
9  10000005     1            0            3    08:56  NY     10006     1           0          10       e          4.0          28          28            0               3.0         13.0               1  1  3  3  1  0  2  755
91000000510308:56NY100061010e4.0282803.013.01133102755

Last rows

        customer_ID  shopping_pt  record_type  day  time   state  location  group_size  homeowner  car_age  car_value  risk_factor  age_oldest  age_youngest  married_couple  C_previous  duration_previous  A  B  C  D  E  F  G  cost
665239  10152721     6            1            4    10:14  CT     11888     1           0          8        e          NaN          23          23            0               4.0         5.0                1  0  3  3  1  0  2  716
665240  10152723     1            0            2    15:13  FL     10711     1           1          0        f          2.0          39          39            0               3.0         2.0                0  0  2  1  0  0  4  656
665241  10152723     2            0            2    15:14  FL     10711     1           1          0        f          2.0          39          39            0               3.0         2.0                1  0  3  3  0  2  3  687
665242  10152723     3            1            1    10:30  FL     10711     1           1          0        g          2.0          39          39            0               3.0         7.0                1  0  3  3  1  2  3  651
665243  10152724     1            0            3    13:42  KY     10204     1           1          1        e          NaN          20          20            0               1.0         4.0                0  0  3  3  0  0  2  642
665244  10152724     2            0            3    13:43  KY     10204     1           1          1        e          NaN          20          20            0               1.0         4.0                1  0  2  3  0  2  2  677
665245  10152724     3            0            3    13:43  KY     10204     1           1          1        e          NaN          20          20            0               1.0         4.0                1  0  2  3  0  2  2  677
665246  10152724     4            0            3    13:44  KY     10204     1           1          1        e          NaN          20          20            0               1.0         4.0                1  0  2  3  0  2  2  677
665247  10152724     5            0            3    13:46  KY     10204     1           1          1        e          NaN          20          20            0               1.0         4.0                1  0  2  3  0  2  2  685
665248  10152724     6            1            1    15:14  KY     10204     1           1          1        d          NaN          20          20            0               4.0         4.0                1  0  3  3  0  2  2  681